Abstract. We introduce a randomized iterative fragmentation procedure for finite metric
spaces, which is guaranteed to result in a polynomially large subset that is D-equivalent
to an ultrametric, where $D \in (2, \infty)$ is a prescribed target distortion. Since this procedure
works for D arbitrarily close to the nonlinear Dvoretzky phase transition at distortion 2, we
thus obtain a much simpler probabilistic proof of the main result of [3], answering a question
from [12], and yielding the best known bounds in the nonlinear Dvoretzky theorem.

Our method utilizes a sequence of random scales at which a given metric space is
fragmented. As in many previous randomized arguments in embedding theory, these scales
are chosen irrespective of the geometry of the metric space in question. We show that our 
bounds are sharp if one utilizes such a "scale-oblivious" fragmentation procedure. 



1. Introduction 

A metric space $(X, d)$ is said to embed into Hilbert space with distortion $D \ge 1$ if there
exists $f : X \to \ell_2$ satisfying $d(x, y) \le \|f(x) - f(y)\|_2 \le D\,d(x, y)$ for all $x, y \in X$.
Dvoretzky's theorem [9] asserts that for every $k \in \mathbb{N}$ and $D > 1$ there exists
$n = n(k, D) \in \mathbb{N}$ such that every n-dimensional normed space has a k-dimensional linear
subspace that embeds into Hilbert space with distortion D; see [14, 13, 15] for the best
known bounds on $n(k, D)$.

Motivated by a possible analogue of Dvoretzky's theorem in the class of general metric
spaces, Bourgain, Figiel and Milman introduced in [6] the nonlinear Dvoretzky problem, which
asks for the largest integer $k = k(n, D)$ such that any n-point metric space has a subset of
cardinality k that embeds into Hilbert space with distortion D. They showed [6] that for every
$D > 1$ we have $\lim_{n \to \infty} k(n, D) = \infty$, thus establishing the validity of a nonlinear
Dvoretzky phenomenon. Quantitatively, the main result of [6] asserts that $k(n, D) \ge c(D) \log n$,
and that there exists $D_0 > 1$ for which $k(n, D_0) = O(\log n)$.

Renewed interest in the nonlinear Dvoretzky problem due to the discovery of applications
to the theory of online algorithms resulted in a sequence of works [11, 5, 1] which culminated
in the following threshold phenomenon from [3] (see also [2, 4, 8] for related results):

Theorem 1.1 ([3]). For $D > 1$ there exist $a(D), A(D) \in (0, \infty)$ and $b(D), B(D) \in (0, 1)$
with the following properties:

(1) If $D \in (1, 2)$ then any n-point metric space has a subset of cardinality $\ge a(D) \log n$
that embeds with distortion D into Hilbert space. On the other hand, there exist
arbitrarily large n-point metric spaces $X_n$ with the property that any $Y \subseteq X_n$ that
embeds into Hilbert space with distortion D necessarily satisfies $|Y| \le A(D) \log n$.

(2) If $D \in (2, \infty)$ then any n-point metric space has a subset of cardinality $\ge n^{1 - b(D)}$ that
embeds with distortion D into Hilbert space. On the other hand, there exist arbitrarily
large n-point metric spaces $X_n$ with the property that any $Y \subseteq X_n$ that embeds into
Hilbert space with distortion D necessarily satisfies $|Y| \le n^{1 - B(D)}$.
Note that the first assertion of part (1) of Theorem 1.1 is just a restatement of the
Bourgain-Figiel-Milman nonlinear Dvoretzky theorem [6].

All the positive embedding results quoted above are actually stronger than embeddings
into Hilbert space: they produce subsets which embed with distortion D into an ultrametric.
Recall that a metric space $(U, \rho)$ is an ultrametric if $\rho(u, v) \le \max\{\rho(u, w), \rho(w, v)\}$ for
every $u, v, w \in U$. Separable ultrametrics embed isometrically into Hilbert space [17], so
the problem of finding a subset of a metric space which embeds with distortion D into an
ultrametric is a (strictly) stronger statement than the nonlinear Dvoretzky problem. The
fact that the embeddings of Theorem 1.1 are into ultrametrics is crucial for its applications.
Note, however, that the impossibility results in Theorem 1.1 rule out embeddings into Hilbert
space, and not just embeddings into ultrametrics.
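
As a concrete illustration of the ultrametric condition recalled above, the following Python sketch checks the strengthened triangle inequality on a finite distance matrix. The function name `is_ultrametric` and the two three-point examples are illustrative assumptions and do not come from the paper.

```python
def is_ultrametric(dist, tol=1e-12):
    """Return True if the symmetric matrix `dist` satisfies
    dist[u][v] <= max(dist[u][w], dist[w][v]) for all u, v, w."""
    n = len(dist)
    return all(
        dist[u][v] <= max(dist[u][w], dist[w][v]) + tol
        for u in range(n) for v in range(n) for w in range(n)
    )

# In an ultrametric every triangle is isosceles with a short base.
isosceles = [[0, 1, 2],
             [1, 0, 2],
             [2, 2, 0]]

# Three equally spaced points on a line form a metric, but not an ultrametric.
path = [[0, 1, 2],
        [1, 0, 1],
        [2, 1, 0]]

print(is_ultrametric(isosceles), is_ultrametric(path))
```

The cubic-time check is only meant for small examples; it makes no attempt to be efficient.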

The proofs of the nonlinear Dvoretzky theorems in [6, 11, 5, 1, 2, 4, 8] proceed via
deterministic constructions. In [12] a new approach to the nonlinear Dvoretzky problem was
introduced, based on a probabilistic argument which is closer in spirit to the proofs of the
classical Dvoretzky theorem. This randomized approach, called the method of Ramsey
partitions, has three main advantages. First, it leads to new algorithmic applications of the
nonlinear Dvoretzky theorem which are very different from the applications in [11, 5, 1, 3];
we shall briefly describe one of these applications in Section 1.1. Second, Ramsey partitions
yield a major simplification of the proof of part (2) in Theorem 1.1 for sufficiently large values
of D (this part of Theorem 1.1 is by far the most complicated part of its proof in [3]). Third,
the bound on the exponent $b(D)$ obtained in [12] is asymptotically sharp as $D \to \infty$, unlike
the bound in [3], which is off by a logarithmic factor. Specifically, [12] yields $b(D) \lesssim 1/D$,
which is optimal up to the implied universal constant due to the bound $B(D) \gtrsim 1/D$ of [3].

An obvious question, raised in [12], suggests itself: can the randomized approach of [12]
yield a proof of Theorem 1.1 in which the target distortion $D > 2$ is allowed to go all the
way down to the phase transition at 2? The main result of [12] states that for $D > 2$,
any n-point metric space X has a subset $Y \subseteq X$ with $|Y| \ge n^{1 - 128/D}$ which embeds with
distortion D into an ultrametric. [12] did not attempt to optimize the constant 128 in this
result, and indeed by a more careful analysis of the arguments of [12] one can ensure that
$|Y| \ge n^{1 - 16/(D - 2)}$ (even this estimate can be slightly improved, but not by much). In any
case, it is clear that these statements become vacuous for D smaller than a universal constant
close enough to 2. Thus, the full $D > 2$ range of part (2) of Theorem 1.1 still required the
use of the deterministic approach of [3].

It was stated in [12] that there does not seem to be a simple way to use Ramsey partitions
to handle distortions arbitrarily close to 2. In Section 1.1 we make a very simple observation
which proves that if $D < 3$, then the method of Ramsey partitions cannot yield a subset Y
as above of size tending to $\infty$ with n. Thus, in fact, it is impossible to approach the phase
transition at 2 using Ramsey partitions. Here we present a new randomized approach,
building on the multiplicative telescoping argument of [12], which proves the nonlinear Dvoretzky
theorem for any distortion $D > 2$. Specifically, we prove the following result:

Theorem 1.2. For every $D > 2$, any n-point metric space has a subset of cardinality $\ge n^{\theta(D)}$
which embeds with distortion D into an ultrametric. Here $\theta = \theta(D) \in (0, 1)$ is the unique
solution of the equation

$$\frac{2}{D} = (1 - \theta)\,\theta^{\theta/(1 - \theta)}.$$

It is elementary to check that $\theta(D) \ge 1 - \frac{2e}{D}$ for all $D > 2$, and that as $\varepsilon \searrow 0$ we have

$$\theta(2 + \varepsilon) = \frac{\varepsilon}{2 \log(1/\varepsilon)} + O\left( \frac{\varepsilon \log\log(1/\varepsilon)}{(\log(1/\varepsilon))^2} \right).$$

Theorem 1.2 yields a very short proof (complete details in 3 pages) of the nonlinear
Dvoretzky theorem for all distortions $D > 2$, with the best known bounds on the exponent
$\theta(D)$. In a sense that is made precise in Section 1.2, the above value of $\theta(D)$ is optimal
for our method.
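
To get a feeling for the size of the exponent, the following Python sketch solves the defining equation $2/D = (1 - \theta)\theta^{\theta/(1-\theta)}$ numerically by bisection and compares the result with the lower bound $1 - 2e/D$. The helper names `g` and `theta_of_D`, the bracket endpoints and the iteration count are assumptions made for this illustration; the sketch relies on the right-hand side being decreasing in $\theta$ on (0, 1), which corresponds to the monotonicity used in Section 3.

```python
import math

def g(theta):
    """Right-hand side (1 - theta) * theta**(theta / (1 - theta)), for 0 < theta < 1."""
    return math.exp(math.log1p(-theta) + theta / (1.0 - theta) * math.log(theta))

def theta_of_D(D, iters=200):
    """Solve 2/D = g(theta) by bisection; g decreases from 1 to 0 on (0, 1)."""
    if D <= 2:
        raise ValueError("the exponent is defined only for D > 2")
    target = 2.0 / D
    lo, hi = 1e-15, 1.0 - 1e-15
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if g(mid) > target:      # theta is still too small
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

for D in (2.1, 3.0, 10.0, 100.0):
    t = theta_of_D(D)
    print(f"D = {D:6.1f}   theta(D) = {t:.6f}   1 - 2e/D = {1 - 2 * math.e / D:.6f}")
```

For D close to 2 the lower bound $1 - 2e/D$ is negative and carries no information, which is consistent with the separate asymptotic expansion stated for $\theta(2 + \varepsilon)$.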

1.1. Approximate distance oracles and limitations of Ramsey partitions. We recall
some terminology and results from [12]. Fix $\delta \in (0, 1)$ and let $(X, d)$ be an n-point metric
space of diameter 1. A sequence $\{\mathscr{P}_k\}_{k=0}^{\infty}$ of partitions of X is called a partition tree of rate
$\delta$ if $\mathscr{P}_0$ is the trivial partition $\{X\}$, for all $k \ge 0$ the partition $\mathscr{P}_{k+1}$ is a refinement of
$\mathscr{P}_k$, and each set in $\mathscr{P}_k$ has diameter at most $\delta^k$.

The main tool in [12] is random partition trees. Let Pr be a probability distribution over
partition trees of rate $\delta$. For $\gamma > 0$ consider the random subset $Y \subseteq X$ consisting of those
$x \in X$ such that for all $k \in \mathbb{N}$ the entire closed ball $B(x, \delta^k/\gamma)$ is contained in the element
of $\mathscr{P}_k$ to which x belongs. Assume that each $x \in X$ falls in Y with Pr-probability at least
$n^{-\beta}$. Then $\mathbb{E}[|Y|] \ge n^{1 - \beta}$. Define for distinct $x, y \in X$ the random quantity $\rho(x, y) = \delta^{k(x, y)}$,
where $k(x, y)$ is the largest integer k such that both x and y fall in the same element of $\mathscr{P}_k$.
Then $\rho$ is an ultrametric on X, and for $x \in X$ and $y \in Y$ we have
$\frac{\delta}{\gamma}\rho(x, y) \le d(x, y) \le \rho(x, y)$ [12, Lem. 2.1]. Thus, on Y, $\rho$ is bi-Lipschitz equivalent to the
original metric d with distortion $\le \gamma/\delta$. But more is true: the ultrametric $\rho$ is defined on all
of X, and approximates up to a factor $\le \gamma/\delta$ all distances from points of Y to all the other
points of X.

In [12] random partition trees were constructed with the desired bounds on $\beta$ and the
distortion $\gamma/\delta$. It was shown in [12] that the existence of an ultrametric $\rho$ on X which has
the above property of approximating distances from points of a large subset $Y \subseteq X$ to all
other points of X has a variety of implications for the theory of data structures. Here we
need to briefly recall the connection to approximate distance oracles.

An n-point metric space $(X, d)$ can be thought of as a table of $\binom{n}{2}$ numbers, corresponding
to the distances between all unordered pairs $x, y \in X$. In the approximate distance oracle
problem the goal is, given $D > 1$, to do "one time work" (preprocessing) that produces a
data structure (called an approximate distance oracle) of size $o(n^2)$ such that given a "query"
$x, y \in X$, one can quickly produce a number $E(x, y)$ satisfying $d(x, y) \le E(x, y) \le D\,d(x, y)$.
We call D the stretch of the approximate distance oracle.

The seminal work on approximate distance oracles is due to Thorup and Zwick [16], who
showed that for all odd $D \in \mathbb{N}$ one can design a data structure of size $O(D n^{1 + 2/(D+1)})$ using which
one can compute in time $O(D)$ a number $E(x, y)$ satisfying $d(x, y) \le E(x, y) \le D\,d(x, y)$.
In [12] it was shown¹ that if every n-point metric space $(X, d)$ admits an ultrametric $\rho$
(defined on all of X) and a subset $Y \subseteq X$ with $|Y| \ge n^{1 - \beta}$ such that for every $x \in X$
and $y \in Y$ we have $d(x, y) \le \rho(x, y) \le D\,d(x, y)$, then any n-point metric space can be
preprocessed to yield a data structure of size $O(n^{1 + \beta})$ using which one can compute in
time $O(1)$ a number $E(x, y)$ satisfying $d(x, y) \le E(x, y) \le D\,d(x, y)$. A key new point here
is that the query time is a universal constant, and does not depend on D as in [16].

¹This assertion is not stated explicitly in [12], but it follows directly from the proof of [12, Th. 1.2]: using
the notation of [12], as noted in the proof of [12, Th. 1.2], the ultrametric $\rho_j$ is only required to be defined,
and satisfy the conclusion of [12, Lem. 4.2], on $X_{j-1}$ and not on all of X. This property is guaranteed by
our assumption. Thus there is no loss of constant factor since for the purpose of [12, Th. 1.2] (unlike other
applications of [12, Lem. 4.2] in [12]), we do not need to use [12, Lem. 4.1].

It was also shown in [16] that any approximate distance oracle that answers distance
queries with stretch $D < 3$ must use $\gtrsim n^2$ bits of storage. Combining this lower bound with
the above construction of [12], we see that if $D < 3$ there must exist arbitrarily large n-point
metric spaces $(X_n, d_n)$ such that if $\rho$ is an ultrametric on $X_n$ and $Y \subseteq X_n$ is such that
$d_n(x, y) \le \rho(x, y) \le D\,d_n(x, y)$ for all $x \in X_n$ and $y \in Y$, then $|Y| \le n^{o(1)}$. It is actually not
difficult to unravel the arguments of [16, 12] to give a direct proof of the fact that Ramsey
partitions cannot yield the nonlinear Dvoretzky theorem for distortions in (2, 3). We will
not do so here since it would be a digression from the topic of the present paper; the purpose
of the above discussion is only to explain why a method other than Ramsey partitions is
required in order to go all the way down to distortion 2.

1.2. The fragmentation procedure and admissible exponents. Having realized that
a proof of part (2) of Theorem 1.1 for D arbitrarily close to 2 cannot produce a large $Y \subseteq X$
and an ultrametric $\rho$ that is defined on all of X and satisfies $d(x, y) \le \rho(x, y) \le D\,d(x, y)$ for
all $x \in X$ and $y \in Y$, it is natural to try to design a procedure which results in an ultrametric
that is defined on the subset Y alone. This is what our fragmentation procedure does.
In order to state our main results, we require the following definition:

Definition 1.3 (Admissible exponent). Fix $D > 2$. We say that $\sigma > 0$ is an admissible
exponent for D if there exists a sequence of (not necessarily independent) random variables
$1 = r_0 \ge r_1 \ge r_2 \ge \cdots > 0$ with $\lim_{n \to \infty} r_n = 0$, such that for every real number $r > 0$, we
have

$$\sum_{n=1}^{\infty} \Pr\left[ r_n < r \le r_n + \frac{2 r_{n-1}}{D} \right] \le \sigma. \tag{1}$$



Theorem 1.4 (Ultrametrics via admissible exponents). Fix $D > 2$, and let $\sigma > 0$ be an
admissible exponent for D. Let $X = (X, d)$ be a finite metric space. Then there exists a subset
S of X of cardinality $|S| \ge |X|^{1 - \sigma}$ which embeds with distortion D into an ultrametric.

Let $\sigma^*(D)$ denote the infimum of those $\sigma > 0$ which are admissible exponents for D. Due
to Theorem 1.4 we would like to estimate $\sigma^*(D)$. In fact, it turns out that we can compute
it exactly; the following theorem, in combination with Theorem 1.4, implies Theorem 1.2.

Theorem 1.5 (Optimization of admissible exponents). For every $D > 2$ we have $\sigma^*(D) = \beta$,
where $\beta \in (0, 1)$ is the unique solution of the equation

$$\frac{2}{D} = \beta (1 - \beta)^{(1 - \beta)/\beta}. \tag{2}$$

Moreover, $\sigma^*(D)$ is attained at the following random variables: $r_0 = 1$, and for $n \in \mathbb{N}$,

$$r_n = (1 - \beta)^{(U + n - 1)/\beta}, \tag{3}$$

where U is a random variable that is uniformly distributed on the interval [0, 1]. For this
choice of $1 = r_0 \ge r_1 \ge r_2 \ge \cdots > 0$, the supremum of the left hand side of (1) over $r > 0$
equals the value of $\beta$ in (2).
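
The statement about the radii (3) can be probed numerically. The Python sketch below approximates the left hand side of (1) for these radii by averaging over a midpoint grid of values of U, and compares it with $\beta$. The function names, the grid size and the bisection bracket are assumptions made for this illustration; it is a sanity check under those assumptions, not part of the proof.

```python
import math

def beta_of_D(D, iters=200):
    """Solve 2/D = beta * (1 - beta)**((1 - beta) / beta) by bisection on (0, 1)."""
    f = lambda b: b * math.exp((1.0 - b) / b * math.log1p(-b))
    lo, hi = 1e-12, 1.0 - 1e-12
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f(mid) < 2.0 / D:     # f is increasing, so beta is still too small
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def admissible_sum(r, D, beta, grid=4000):
    """Approximate sum_n Pr[r_n < r <= r_n + 2 r_{n-1} / D] for the radii (3), r > 0,
    by averaging over a midpoint grid of values of U in [0, 1]."""
    hits = 0
    for k in range(grid):
        u = (k + 0.5) / grid
        prev, n = 1.0, 1                      # prev holds r_{n-1}; r_0 = 1
        while True:
            rn = (1.0 - beta) ** ((u + n - 1) / beta)
            upper = rn + 2.0 * prev / D
            if rn < r <= upper:
                hits += 1
            if upper < r:                     # upper ends decrease in n, so stop here
                break
            prev, n = rn, n + 1
    return hits / grid

D = 3.0
beta = beta_of_D(D)
for r in (1.5, 1.0, 0.5, 0.1, 0.01, 0.001):
    print(f"r = {r:6.3f}   sum = {admissible_sum(r, D, beta):.4f}   beta = {beta:.4f}")
```

For small r the computed sum should agree with $\beta$ up to the grid error, while for r near the top scale it can be smaller, since the term n = 1 involves $r_0 = 1$ rather than the value that formula (3) would give at n = 0.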

The construction of the subset S in Theorem 1.4 is most natural to describe in the
context of compact metric spaces, though it will be applied here only to finite metric spaces.
Throughout this paper a metric probability space $(X, d, \mu)$ is a compact metric space $(X, d)$
equipped with a Borel probability measure $\mu$. For $x \in X$ and $r \ge 0$ we shall use the
standard notation for (closed) balls: $B(x, r) = \{y \in X : d(x, y) \le r\}$. To avoid degeneracies we
assume that for every $r > 0$ we have $\mu(B(x, r)) > 0$, and that the function $x \mapsto \mu(B(x, r))$
is measurable. Of course, these hypotheses are automatic in the case of finite metric spaces
with uniform measure.

Fix a metric probability space $(X, d, \mu)$, normalized to have diameter 2, and a decreasing
sequence of radii $1 = r_0 \ge r_1 \ge r_2 \ge \cdots > 0$. Define inductively a decreasing sequence of
random subsets $X = S_0 \supseteq S_1 \supseteq S_2 \supseteq \cdots$ as follows. Having defined $S_i$, let $\{x_n\}_{n=1}^{\infty}$ be an i.i.d.
sequence of points in X, each distributed according to $\mu$. The set $S_{i+1}$ is defined to be those
points $x \in S_i$ for which the first point in the sequence $\{x_n\}_{n=1}^{\infty}$ that fell in the ball
$B(x, r_{i+1} + 2 r_i / D)$ actually fell in the smaller ball $B(x, r_{i+1})$. Letting $S = \bigcap_{i=0}^{\infty} S_i$, we argue
(see Lemma 2.1 and Lemma 2.2) that $(S, d)$ embeds with distortion D into an ultrametric, and that

$$\mathbb{E}[\mu(S)] \ge \int_X \prod_{n=1}^{\infty} \frac{\mu(B(x, r_n))}{\mu\left(B\left(x, r_n + \frac{2 r_{n-1}}{D}\right)\right)} \, d\mu(x). \tag{4}$$
So far we did not use the fact that the radii $\{r_n\}_{n=0}^{\infty}$ are themselves random. The additional
randomness allows us to use a refinement of an idea of [12] in order to control the infinite
product appearing in (4) using Jensen's inequality (the corresponding step in [12] used the
AM-GM inequality). This is how the notion of admissible exponent appears in Theorem 1.4;
the details appear in Section 2. Note that the proof of Theorem 1.2 is simple to describe: it
follows the above outline with the specific sequence of random radii given in (3). (Observe
that this sequence of radii involves a choice of only one random number U, unlike the
construction of [12], and its predecessors [7, 10], in which $r_n$ was uniformly distributed on
$[8^{-n}/4, 8^{-n}/2]$, and the $\{r_n\}_{n=0}^{\infty}$ were independent random variables.)

The obvious weakness of the above approach is that the random radii $\{r_n\}_{n=0}^{\infty}$ are chosen
without consideration of the particular geometry of the metric space X. It makes sense that 
in order to obtain sharper results one would need to investigate how different scales in X 
interact, and reflect this understanding in a choice of radii which are not "scale-oblivious". 
Theorem 1.5 shows that in order to improve our bounds in Theorem 1.2 one would need
to use a fragmentation procedure that is not scale-oblivious (or, find a way to control an
expression such as (4) without using Jensen's inequality; this seems quite difficult).

A particular question of interest in this context is as follows: for $D > 2$ let $\theta^*(D)$ be
the supremum of those $\theta > 0$ such that there exists $n_0 \in \mathbb{N}$ for which any metric space
of cardinality $n \ge n_0$ has a subset of size $\ge n^{\theta}$ that embeds with distortion D into an
ultrametric. Both [3] and our new proof give the bound $\theta^*(2 + \varepsilon) \gtrsim \varepsilon/\log(2/\varepsilon)$ (for different
reasons). Must it be the case that $\theta^*(2 + \varepsilon)$ tends to 0 as $\varepsilon \searrow 0$? This is of course related to
the unknown behavior of the nonlinear Dvoretzky problem at distortion D = 2. Computing
the value of $\limsup_{D \to \infty} D(1 - \theta^*(D))$ is also of interest; due to Theorem 1.5 we know that
using our scale-oblivious metric fragmentation procedure we cannot bound this number by
less than 2e.

Acknowledgements. We thank Manor Mendel for helpful discussions on the Thorup-Zwick 
lower bound. A. N. is supported by NSF grants CCF-0635078 and CCF-0832795, BSF grant 
2006009, and the Packard Foundation. T. T. is supported by a grant from the MacArthur 
foundation, by NSF grant DMS-0649473, and by the NSF Waterman award. 
2. Randomized fragmentation

We begin with a lemma that fragments a metric space at a single pair of scales $R > r > 0$.

Lemma 2.1 (Fragmentation lemma). Let $(X, d, \mu)$ be a metric probability space, and let
$S \subseteq X$ be a compact subset of X. Fix $R > r > 0$ and a Borel-measurable non-negative
function $w : S \to [0, \infty)$. Then there exists a compact subset $T \subseteq S$ with

$$\int_S \frac{\mu(B(x, r))}{\mu(B(x, R))}\, w(x)\, d\mu(x) \le \int_T w(x)\, d\mu(x), \tag{5}$$

such that T can be partitioned as $T = \bigcup_{n=1}^{\infty} T_n$, where each (possibly empty) $T_n$ is compact
and contained in a ball of radius r, and any two non-empty $T_n$, $T_m$ are separated by a distance
of at least $R - r$.

Proof. We use the probabilistic method. Let $\{x_n\}_{n=1}^{\infty}$ be an i.i.d. sequence of points in X,
selected using the measure $\mu$. Observe that as $B(x, R)$ has positive measure for all $x \in X$, we
will almost surely have $x_n \in B(x, R)$ for at least one $n \in \mathbb{N}$. Thus if we define the (random)
quantity

$$n(x) \stackrel{\mathrm{def}}{=} \inf\{ n \in \mathbb{N} : x_n \in B(x, R) \}, \tag{6}$$

then $n(x)$ is finite for almost every $x \in X$, and $x \mapsto n(x)$ is a measurable function of x.
Define a (random) subset $A \subseteq S$ by

$$A \stackrel{\mathrm{def}}{=} \{ x \in S : n(x) < \infty \wedge x_{n(x)} \in B(x, r) \}. \tag{7}$$

Then $A = \bigcup_{n=1}^{\infty} A_n$, where

$$A_n \stackrel{\mathrm{def}}{=} \{ x \in S : n(x) = n \wedge x_n \in B(x, r) \}. \tag{8}$$

By definition we have $A_n \subseteq B(x_n, r)$. Also, if $x \in A_n$ and $y \in A_m$ for some $1 \le n < m$, then
by the definitions (6), (8) we have $d(x_n, x) \le r$ and $d(x_n, y) > R$, and hence by the triangle
inequality we have $d(x, y) > R - r$. Thus if we set $T_n \stackrel{\mathrm{def}}{=} \overline{A_n}$, then $T_n$ and $T_m$ are compact
and separated by a distance of at least $R - r$ (this shows that only finitely many of the $T_n$
are non-empty). If we define $T \stackrel{\mathrm{def}}{=} \bigcup_{n=1}^{\infty} T_n$, then T is a compact subset of S.

Since $T \supseteq A$, in order to conclude the proof of Lemma 2.1 it suffices to prove the identity

$$\mathbb{E}\left[ \int_A w(x)\, d\mu(x) \right] = \int_S \frac{\mu(B(x, r))}{\mu(B(x, R))}\, w(x)\, d\mu(x). \tag{9}$$

By the Fubini-Tonelli theorem, in order to prove (9) it suffices to show that for all $x \in S$ we
have

$$\Pr[x \in A] = \frac{\mu(B(x, r))}{\mu(B(x, R))}. \tag{10}$$

Since $n(x)$ is finite almost surely, the definition (7), together with the joint independence of
$x_1, x_2, \ldots$, immediately implies that

$$\Pr[x \in A] = \sum_{n=1}^{\infty} \Pr\left[ x_n \in B(x, r) \wedge x_1, \ldots, x_{n-1} \notin B(x, R) \right]
= \sum_{n=1}^{\infty} \mu(B(x, r)) \left( 1 - \mu(B(x, R)) \right)^{n-1} = \frac{\mu(B(x, r))}{\mu(B(x, R))}. \tag{11}$$

This proves (10), and thus concludes the proof of Lemma 2.1. □
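
For a finite metric space with uniform measure, a single fragmentation step can be sketched in Python as follows. Two assumptions are made for this illustration and are not taken from the paper: the points are given as coordinates in Euclidean space, and the i.i.d. sequence $(x_n)$ is replaced by a uniformly random ordering of the points, which is enough here because only the order of first appearances in the sequence matters. The function name `fragment_once` is hypothetical.

```python
import math
import random

def fragment_once(coords, S, R, r, rng):
    """One fragmentation step for the uniform measure on a finite point set.

    `coords` lists the points, `S` the indices of the current subset.
    Returns {centre index: list of kept points of S}; the lists play the
    role of the sets A_n in the proof of the fragmentation lemma.
    """
    order = list(range(len(coords)))
    rng.shuffle(order)                        # stands in for the i.i.d. sequence
    d = lambda i, j: math.dist(coords[i], coords[j])
    clusters = {}
    for x in S:
        # First point of the random order landing in B(x, R); x itself qualifies.
        first = next(c for c in order if d(x, c) <= R)
        if d(x, first) <= r:                  # keep x only if it landed in B(x, r)
            clusters.setdefault(first, []).append(x)
    return clusters

rng = random.Random(1)
points = [(rng.random(), rng.random()) for _ in range(60)]
clusters = fragment_once(points, list(range(60)), R=0.4, r=0.1, rng=rng)
print(len(clusters), "clusters,", sum(map(len, clusters.values())), "points kept")
```

Each returned cluster lies in the ball of radius r around its centre, and points in different clusters are more than $R - r$ apart, mirroring the two properties of the sets $T_n$ in Lemma 2.1.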

We can iterate Lemma 2.1 as follows.

Lemma 2.2 (Iterated fragmentation lemma). Fix $R, D > 0$. Let $(X, d, \mu)$ be a metric
probability space of diameter at most 2R, and let

$$R = r_0 \ge r_1 \ge r_2 \ge \cdots > 0$$

be a sequence of radii converging to zero. Then there exists a compact subset S of X such
that

$$\mu(S) \ge \int_X \prod_{n=1}^{\infty} \frac{\mu(B(x, r_n))}{\mu\left(B\left(x, r_n + \frac{2 r_{n-1}}{D}\right)\right)} \, d\mu(x), \tag{12}$$

and $(S, d)$ embeds with distortion D into an ultrametric.

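Proof. By applying Lemma 2.1 repeatedly (at step n with the radii $r_n + 2 r_{n-1}/D > r_n$), we
obtain a decreasing sequence of compact subsets of X,

$$X = S_0 \supseteq S_1 \supseteq S_2 \supseteq \cdots$$

satisfying for $n \ge 1$,

$$\int_{S_n} \prod_{m=n+1}^{\infty} \frac{\mu(B(x, r_m))}{\mu\left(B\left(x, r_m + \frac{2 r_{m-1}}{D}\right)\right)} \, d\mu(x)
\ge \int_{S_{n-1}} \prod_{m=n}^{\infty} \frac{\mu(B(x, r_m))}{\mu\left(B\left(x, r_m + \frac{2 r_{m-1}}{D}\right)\right)} \, d\mu(x),$$

such that for each $n \in \mathbb{N}$ we have $S_n = \bigcup_{j=1}^{\infty} S_{n,j}$, where each $S_{n,j}$ is compact and contained
in a ball of radius $r_n$, and if $S_{n,j}, S_{n,j'} \neq \emptyset$ for $j \neq j'$ then $d(S_{n,j}, S_{n,j'}) \ge 2 r_{n-1}/D$. It follows
inductively that

$$\int_{S_n} \prod_{m=n+1}^{\infty} \frac{\mu(B(x, r_m))}{\mu\left(B\left(x, r_m + \frac{2 r_{m-1}}{D}\right)\right)} \, d\mu(x)
\ge \int_X \prod_{m=1}^{\infty} \frac{\mu(B(x, r_m))}{\mu\left(B\left(x, r_m + \frac{2 r_{m-1}}{D}\right)\right)} \, d\mu(x),$$

and in particular, since the integrand on the left is at most 1,

$$\mu(S_n) \ge \int_X \prod_{m=1}^{\infty} \frac{\mu(B(x, r_m))}{\mu\left(B\left(x, r_m + \frac{2 r_{m-1}}{D}\right)\right)} \, d\mu(x).$$

If we set $S \stackrel{\mathrm{def}}{=} \bigcap_{n=1}^{\infty} S_n$, then S is compact and obeys (12).

If $x, y \in S$ are distinct, let $n(x, y)$ be the largest integer n such that for all $m \in \{1, \ldots, n\}$
there is $j(m) \in \mathbb{N}$ for which $x, y \in S_{m, j(m)}$. Note that since the diameter of $S_{m,j}$ is at most
$2 r_m$, and $\lim_{m \to \infty} r_m = 0$, such an n must exist. Now define an ultrametric $\rho$ on S by

$$\rho(x, y) = 2 r_{n(x, y)}.$$

It is immediate to check that $\rho$ is symmetric, and obeys the ultratriangle inequality

$$\forall x, y, z \in S, \quad \rho(x, y) \le \max\{\rho(x, z), \rho(z, y)\}.$$

If $x, y \in S$ are distinct and $n = n(x, y)$, then by definition $x, y \in S_{n,j}$ for some $j \in \mathbb{N}$
(with the convention $S_{0,j} = X$) and $x \in S_{n+1,k}$, $y \in S_{n+1,\ell}$, where $k \neq \ell$. Thus
$d(x, y) \le \mathrm{diam}(S_{n,j}) \le 2 r_n = \rho(x, y)$ and
$d(x, y) \ge d(S_{n+1,k}, S_{n+1,\ell}) \ge 2 r_n / D = \rho(x, y)/D$. It follows that the identity map from $(S, d)$
to $(S, \rho)$ has distortion at most D, completing the proof of Lemma 2.2. □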
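
The iteration and the resulting ultrametric can be sketched for finite point sets as follows. This is a minimal illustration under stated assumptions, not the paper's construction verbatim: the points are distinct points of Euclidean space, the radii are taken to be geometric ($r_n = r_0 \cdot \mathrm{ratio}^n$, an arbitrary deterministic choice rather than the random radii (3)), each step uses a fresh uniformly random ordering of the points in place of the i.i.d. sequence, and the names `iterated_fragmentation` and `rho` are hypothetical.

```python
import math
import random

def iterated_fragmentation(coords, D, ratio=0.5, seed=0):
    """Return (labels, radii): labels[x] lists the cluster centres of each
    surviving point x at levels 1, 2, ...; radii[n] is r_n, with 2 * r_0 = diameter."""
    rng = random.Random(seed)
    n = len(coords)
    d = lambda i, j: math.dist(coords[i], coords[j])
    diam = max(d(i, j) for i in range(n) for j in range(n))
    dmin = min(d(i, j) for i in range(n) for j in range(i + 1, n))
    radii = [diam / 2.0]
    labels = {x: [] for x in range(n)}        # S_0 = X
    while True:
        r_prev = radii[-1]
        r = r_prev * ratio
        radii.append(r)
        big = r + 2.0 * r_prev / D            # outer radius of this step
        order = list(range(n))
        rng.shuffle(order)
        kept = {}
        for x, lab in labels.items():
            first = next(c for c in order if d(x, c) <= big)
            if d(x, first) <= r:
                kept[x] = lab + [first]
        labels = kept
        if big < dmin:                        # all balls are singletons from now on
            return labels, radii

def rho(labels, radii, x, y):
    """Ultrametric 2 * r_{n(x, y)} for distinct survivors x and y."""
    k = 0
    for a, b in zip(labels[x], labels[y]):
        if a != b:
            break
        k += 1
    return 2.0 * radii[k]

rng = random.Random(7)
pts = [(rng.random(), rng.random()) for _ in range(50)]
labels, radii = iterated_fragmentation(pts, D=4.0)
print(len(labels), "of", len(pts), "points survive")
```

On the surviving set, the inequalities $\rho(x, y)/D \le d(x, y) \le \rho(x, y)$ hold for every run, whatever the random orderings are; only the size of the surviving set is random, and it may be small for an unlucky run.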

Now suppose that $(X, d)$ is a finite metric space, and that $\mu$ is the normalized counting
measure on X. Then Lemma 2.2 specializes to

Corollary 2.3 (Iterated fragmentation lemma, finite case). Fix $D, R > 0$. Let $X = (X, d)$
be a finite metric space of diameter at most 2R, and let

$$R = r_0 \ge r_1 \ge r_2 \ge \cdots > 0$$

be a sequence of radii converging to zero. Then there exists a subset S of X such that

$$|S| \ge \sum_{x \in X} \prod_{n=1}^{\infty} \frac{|B(x, r_n)|}{\left|B\left(x, r_n + \frac{2 r_{n-1}}{D}\right)\right|}, \tag{13}$$

and $(S, d)$ embeds with distortion D into an ultrametric.

The condition (13) is difficult to work with. However, using a random choice of $r_n$, and
Jensen's inequality, one can obtain a more workable condition in terms of the notion of
admissible exponent as in Definition 1.3. This is contained in Theorem 1.4, which we are
now in position to prove.



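Proof of Theorem 1.4. By rescaling we may assume that X has diameter at most 2. We let
$r_0, r_1, \ldots$ be the random variables in Definition 1.3, i.e., (1) holds for all $r > 0$. Applying
Corollary 2.3 we thus obtain a (random) subset $S \subseteq X$ obeying (13), which embeds with
distortion D into an ultrametric. Taking expectations we obtain

$$\mathbb{E}[|S|] \ge \sum_{x \in X} \mathbb{E}\left[ \prod_{n=1}^{\infty} \frac{|B(x, r_n)|}{\left|B\left(x, r_n + \frac{2 r_{n-1}}{D}\right)\right|} \right],$$

and hence by Jensen's inequality

$$\mathbb{E}[|S|] \ge \sum_{x \in X} \exp\left( \mathbb{E}\left[ \sum_{n=1}^{\infty} \log \frac{|B(x, r_n)|}{\left|B\left(x, r_n + \frac{2 r_{n-1}}{D}\right)\right|} \right] \right). \tag{14}$$

For every $x \in X$ let $0 = t_1(x) < t_2(x) < \cdots < t_{k(x)}(x)$ be the radii at which $|B(x, t)|$ jumps,
i.e., $1 = |B(x, t_1(x))| < |B(x, t_2(x))| < \cdots < |B(x, t_{k(x)}(x))| = |X|$, and $B(x, t) = B(x, t_j(x))$
if $t_j(x) \le t < t_{j+1}(x)$ (where we use the convention $t_{k(x)+1}(x) = \infty$). Note that for every
random variable $r \ge 0$ we have the following simple identity:

$$\mathbb{E}[\log |B(x, r)|] = \sum_{j=1}^{k(x)} \Pr\left[ t_j(x) \le r < t_{j+1}(x) \right] \log |B(x, t_j(x))|$$
$$= \sum_{j=1}^{k(x)} \left( \Pr[r \ge t_j(x)] - \Pr[r \ge t_{j+1}(x)] \right) \log |B(x, t_j(x))|$$
$$= \sum_{j=2}^{k(x)} \Pr[r \ge t_j(x)] \log \frac{|B(x, t_j(x))|}{|B(x, t_{j-1}(x))|}. \tag{15}$$

Applying (15) to $r = r_n$ and $r = r_n + \frac{2 r_{n-1}}{D}$, we see that (14) can be written as

$$\mathbb{E}[|S|] \ge \sum_{x \in X} \exp\left( - \sum_{j=2}^{k(x)} \left( \sum_{n=1}^{\infty} \Pr\left[ r_n < t_j(x) \le r_n + \frac{2 r_{n-1}}{D} \right] \right) \log \frac{|B(x, t_j(x))|}{|B(x, t_{j-1}(x))|} \right). \tag{16}$$

Applying (1) we conclude that

$$\mathbb{E}[|S|] \ge \sum_{x \in X} \exp\left( - \sigma \sum_{j=2}^{k(x)} \log \frac{|B(x, t_j(x))|}{|B(x, t_{j-1}(x))|} \right) = \sum_{x \in X} |X|^{-\sigma} = |X|^{1 - \sigma},$$

where we used the fact that $|B(x, t_1(x))| = 1$ and $|B(x, t_{k(x)}(x))| = |X|$. In particular, some
realization of S satisfies $|S| \ge |X|^{1 - \sigma}$. The proof of Theorem 1.4 is complete. □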



3. Proof of Theorem 1.5

Define $f : [0, 1] \to [0, 1]$ by $f(\beta) = \beta (1 - \beta)^{(1 - \beta)/\beta}$, where $f(0) = 0$ and $f(1) = 1$. Note that
$(\log f)'(\beta) = -\log(1 - \beta)/\beta^2 > 0$, and therefore f is strictly increasing on [0, 1]. It follows that
for each $\alpha \in [0, 1]$ there is a unique $\beta = \beta(\alpha)$ satisfying the identity

$$\alpha = \beta (1 - \beta)^{(1 - \beta)/\beta}. \tag{17}$$

Fix $D > 2$ and set $\beta = \beta(2/D)$. Let U be a random variable that is uniformly distributed
on [0, 1]. We shall define a sequence of random variables $r_0 \ge r_1 \ge r_2 \ge \cdots > 0$ as in (3),
i.e., by setting $r_0 = 1$, and for $n \in \mathbb{N}$,

$$r_n \stackrel{\mathrm{def}}{=} (1 - \beta)^{(U + n - 1)/\beta}.$$

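Writing $\alpha = 2/D$, for every $r > 0$ we have the following bound on the left hand side of (1):

$$\sum_{n=1}^{\infty} \Pr\left[ r_n < r \le r_n + \frac{2 r_{n-1}}{D} \right]
\le \sum_{n=1}^{\infty} \Pr\left[ (1 - \beta)^{\frac{U + n - 1}{\beta}} < r \le (1 - \beta)^{\frac{U + n - 1}{\beta}} + \alpha (1 - \beta)^{\frac{U + n - 2}{\beta}} \right]
= \sum_{n=1}^{\infty} \Pr[U \in I_n], \tag{18}$$

where $I_n$ is the interval:

$$I_n \stackrel{\mathrm{def}}{=} \left[ \frac{\beta \log r}{\log(1 - \beta)} - n + 1,\ \frac{\beta \log r}{\log(1 - \beta)} - n + 1 - \frac{\beta \log\left( 1 + \alpha (1 - \beta)^{-1/\beta} \right)}{\log(1 - \beta)} \right] \stackrel{\mathrm{def}}{=} [a - n, b - n].$$

The identity (17) implies that $b - a = \beta$. In particular, $b - a \le 1$, and hence the intervals
$\{I_n\}_{n=1}^{\infty}$ are disjoint, at most two of them intersect [0, 1], and the total length of the
intersection of $\bigcup_{n=1}^{\infty} I_n$ with [0, 1] is at most $b - a$. Combined with (18), this observation implies
that

$$\sum_{n=1}^{\infty} \Pr\left[ r_n < r \le r_n + \frac{2 r_{n-1}}{D} \right] \le \mathrm{length}\left( \left( \bigcup_{n=1}^{\infty} I_n \right) \cap [0, 1] \right) \le b - a = \beta.$$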

This proves the second assertion of Theorem 1.5. It remains to prove that for all $D > 2$ we
have $\sigma^*(D) \ge \beta(2/D)$. To this end let $\{r_n\}_{n=0}^{\infty}$ be a sequence of random variables decreasing
to zero as in Definition 1.3, so that (1) holds for some $\sigma > 0$. Our goal is to show that
$\sigma \ge \beta(2/D)$.

For $\alpha, p \in (0, 1)$ denote

$$\beta_p(\alpha) \stackrel{\mathrm{def}}{=} \inf_{x > 1} \frac{(1 + \alpha x)^p - 1}{x^p - 1}. \tag{19}$$
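By homogeneity, for every $y \ge x > 0$ we have $(x + \alpha y)^p - x^p \ge \beta_p(\alpha)(y^p - x^p)$. Thus for all
$n \in \mathbb{N}$,

$$\left( r_n + \frac{2 r_{n-1}}{D} \right)^p - r_n^p \ge \beta_p\left( \frac{2}{D} \right) \left( r_{n-1}^p - r_n^p \right),$$

and therefore

$$\mathbb{E}\left[ \sum_{n=1}^{\infty} \left( \left( r_n + \frac{2 r_{n-1}}{D} \right)^p - r_n^p \right) \right]
\ge \beta_p\left( \frac{2}{D} \right) \mathbb{E}\left[ \sum_{n=1}^{\infty} \left( r_{n-1}^p - r_n^p \right) \right] = \beta_p\left( \frac{2}{D} \right), \tag{20}$$

where we used the fact that $\lim_{n \to \infty} r_n = 0$ and $r_0 = 1$.
Now,

$$\mathbb{E}\left[ \sum_{n=1}^{\infty} \left( \left( r_n + \frac{2 r_{n-1}}{D} \right)^p - r_n^p \right) \right]
= \sum_{n=1}^{\infty} \int_0^{\infty} p r^{p-1} \left( \Pr\left[ r_n + \frac{2 r_{n-1}}{D} \ge r \right] - \Pr[r_n \ge r] \right) dr$$
$$= \int_0^{\infty} p r^{p-1} \sum_{n=1}^{\infty} \Pr\left[ r_n < r \le r_n + \frac{2 r_{n-1}}{D} \right] dr
\le \sigma \int_0^{1 + 2/D} p r^{p-1}\, dr = \sigma \left( 1 + \frac{2}{D} \right)^p, \tag{21}$$

where the last inequality uses (1) together with the fact that $r_n + 2 r_{n-1}/D \le 1 + 2/D$ for all n.

By combining (20) and (21) (which hold for all $p \in (0, 1)$), we see that the bound $\sigma \ge \beta(2/D)$
will be proven if we manage to show that for all $\alpha \in (0, 1)$,

$$\limsup_{p \to 0} \beta_p(\alpha) \ge \beta(\alpha), \tag{22}$$

where $\beta(\alpha)$ is the unique $\beta \in (0, 1)$ satisfying (17).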

To prove (22), define $f : [1, \infty) \to \mathbb{R}$ by $f(x) = (1 + \alpha x)^p - 1 - \beta_p(\alpha)(x^p - 1)$. Note
that f takes only non-negative values, due to the definition (19). By considering the limit
as $x \to \infty$ of the right hand side of (19), we see that $\beta_p(\alpha) \le \alpha^p$. But, it cannot be the case
that $\beta_p(\alpha) = \alpha^p$, since otherwise $f(x) = (1 + \alpha x)^p - (\alpha x)^p - (1 - \alpha^p)$, which, since $p \in (0, 1)$,
tends to $-(1 - \alpha^p) < 0$ as $x \to \infty$, contradicting the non-negativity of f on $[1, \infty)$. Thus
$\beta_p(\alpha) < \alpha^p$. It follows in particular that the infimum in (19) is actually a minimum, i.e.,
there exists $x_0 \in (1, \infty)$ for which $\beta_p(\alpha) = \frac{(1 + \alpha x_0)^p - 1}{x_0^p - 1}$. This is the same as $f(x_0) = 0$, and
since f is non-negative, $x_0$ must be a global minimum of f, and hence $f'(x_0) = 0$.

From $0 = f'(x_0) = p\alpha(1 + \alpha x_0)^{p-1} - p\beta_p(\alpha) x_0^{p-1}$ we see that

$$x_0 = \frac{1}{(\alpha/\beta_p(\alpha))^{1/(1-p)} - \alpha}. \tag{23}$$

Substituting this value of $x_0$ into the equation $f(x_0) = 0$, we see that

$$\left( \frac{(\alpha/\beta_p(\alpha))^{1/(1-p)}}{(\alpha/\beta_p(\alpha))^{1/(1-p)} - \alpha} \right)^p - 1
= \beta_p(\alpha) \left( \frac{1}{\left( (\alpha/\beta_p(\alpha))^{1/(1-p)} - \alpha \right)^p} - 1 \right). \tag{24}$$

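Denote $\beta = \limsup_{p \to 0} \beta_p(\alpha)$. If $\beta = 1$ then (22) holds trivially. We may therefore assume
that $\beta < 1$. Moreover, (23) combined with $x_0 > 1$ implies that $(\alpha/\beta_p(\alpha))^{1/(1-p)} - \alpha < 1$, or
$\beta_p(\alpha) > \frac{\alpha}{(1 + \alpha)^{1-p}}$. Thus $\beta \ge \frac{\alpha}{1 + \alpha}$ (all that we will need below is that $\beta \neq 0$).

If $\{p_k\}_{k=1}^{\infty} \subseteq (0, 1)$ is such that $\lim_{k \to \infty} p_k = 0$ and $\lim_{k \to \infty} \beta_{p_k}(\alpha) = \beta$, then it follows
from (24) that:

$$p_k \left( \log\left( \frac{\alpha}{\beta} \right) - \log\left( \frac{\alpha}{\beta} - \alpha \right) \right) + o(p_k)
= -p_k \beta \log\left( \frac{\alpha}{\beta} - \alpha \right) + o(p_k). \tag{25}$$

Since, as argued above, $\frac{\alpha}{\beta}, \frac{\alpha}{\beta} - \alpha \in (0, \infty)$, the asymptotic identity (25) implies that

$$\log\left( \frac{\alpha}{\beta} \right) - \log\left( \frac{\alpha}{\beta} - \alpha \right) = -\beta \log\left( \frac{\alpha}{\beta} - \alpha \right),$$

which simplifies to give $\alpha = \beta (1 - \beta)^{(1 - \beta)/\beta}$. Since we already argued (in the paragraph
preceding (17)) that $\beta(\alpha)$ is the unique solution of the equation (17), we deduce that
$\beta = \beta(\alpha)$. The proof of (22), and hence also the proof of Theorem 1.5, is complete. □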